Inferring human intrinsic rewards through inverse reinforcement learning
Similar resources
Learning Robust Rewards with Adversarial Inverse Reinforcement Learning
Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering. Deep reinforcement learning methods can remove the need for explicit engineering of policy or value features, but still require a manually specified reward function. Inverse reinforcement lea...
Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics
Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent’s behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current ...
Reinforcement Learning Without Rewards
Machine learning can be broadly defined as the study and design of algorithms that improve with experience. Reinforcement learning is a variety of machine learning that makes minimal assumptions about the information available for learning, and, in a sense, defines the problem of learning in the broadest possible terms. Reinforcement learning algorithms are usually applied to “interactive” prob...
Inverse Reinforcement Learning through Policy Gradient Minimization
Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimized by an expert given a set of demonstrations of the expert’s policy. Most IRL algorithms need to repeatedly compute the optimal policy for different reward functions. This paper proposes a new IRL approach that makes it possible to recover the reward function without the need of solving any “direct” RL pr...
Inverse Reinforcement Learning through Structured Classification
This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multiclass classifier. This approach produces a reward function for which the expert ...
Journal
Journal title: Frontiers in Computational Neuroscience
Year: 2012
ISSN: 1662-5188
DOI: 10.3389/conf.fncom.2012.55.00050